Adaptive model-based speech enhancement
Authors
Abstract
Declaration
This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration except where stated. It has not been submitted in whole or in part for a degree at any other university. The length of this thesis, including footnotes and appendices, is approximately 37000 words.

Summary
This dissertation details the development and evaluation of techniques to enhance speech corrupted by unknown, independent, additive noise when only a single microphone is available. It therefore seeks to address a deficiency of many speech enhancement systems, which require a priori knowledge of the interfering noise statistics. This deficiency must be corrected if these systems are to operate in real-world situations.

The enhancement systems developed are based on an existing system by Ephraim (Ephraim 1992a). This approach models the speech and noise statistics using autoregressive hidden Markov models (AR-HMMs). Two main extensions to this technique are developed in order to make it adaptive. The first estimates the noise statistics from detected pauses. The second forms maximum likelihood estimates of the unknown noise parameters using the whole utterance. Both techniques operate within the AR-HMM framework.

Additional work in this dissertation improves the modelling power of AR-HMM systems by incorporating perceptual frequency. The bilinear transform is used to warp the frequency spectrum of the feature vectors to an approximation of the Bark scale. This modification can be incorporated into both AR-HMM recognition and enhancement systems.

The enhancement techniques are evaluated on the NOISEX-92 and Resource Management (RM) databases, giving indications of performance on simple and more complex tasks respectively. Additional experiments investigating the incorporation of perceptual frequency into AR-HMM systems were conducted on the E-set of the speaker-independent ISOLET database.
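As an illustration of the frequency warping mentioned in the summary, the following is a minimal sketch of the frequency mapping induced by a first-order all-pass (bilinear) transform. It is not the dissertation's implementation: the function name `bilinear_warp` and the warping coefficient `ALPHA = 0.5` are assumptions chosen for illustration. The sketch shows only how frequencies are remapped, not how the warp is applied to AR feature vectors.

```python
import math


def bilinear_warp(omega, alpha):
    """Warp a normalised frequency omega (radians/sample, 0..pi).

    Uses the phase response of a first-order all-pass section:
        omega' = omega + 2 * atan(alpha * sin(omega) / (1 - alpha * cos(omega)))
    For 0 < alpha < 1 the low-frequency region is stretched, which is the
    qualitative behaviour of perceptual scales such as Bark.
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


# Hypothetical warping coefficient; a value suited to the Bark scale
# depends on the sampling rate and is not given in the text above.
ALPHA = 0.5

# Evaluate the warp on a uniform grid from 0 to pi.
grid = [k * math.pi / 8 for k in range(9)]
warped = [bilinear_warp(w, ALPHA) for w in grid]

for w, ww in zip(grid, warped):
    print(f"{w:.3f} -> {ww:.3f}")
```

The mapping leaves 0 and pi fixed and is strictly increasing for |alpha| < 1, so it reallocates resolution along the frequency axis without reordering frequencies. Setting `alpha = 0` recovers the unwarped (linear) scale.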
Both enhancement schemes proposed were able to improve substantially on baseline results. The technique of forming maximum likelihood estimates of the noise parameters was found to be the more effective of the two. Its performance was evaluated over a wide range of noise conditions, from -6 dB to 18 dB, and on various types of stationary real-world noise. The incorporation of perceptual frequency into AR-HMM systems was found to increase recognition performance substantially on both the ISOLET and RM databases. The improvement was less marked for the more complex task, highlighting that AR-HMMs could benefit from the inclusion of more variance information.

Acknowledgements
First I would like to thank my supervisor Tony Robinson. He provided me with the …
Similar resources
Speech Enhancement by Modified Convex Combination of Fractional Adaptive Filtering
This paper presents new adaptive filtering techniques used in speech enhancement systems. Adaptive filtering schemes are subject to different trade-offs regarding their steady-state misadjustment, speed of convergence, and tracking performance. Fractional Least-Mean-Square (FLMS) is a new adaptive algorithm which has better performance than the conventional LMS algorithm. Normalization of LMS ...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
The quality of a speech signal is significantly reduced in the presence of environmental noise, which leads to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...
Speech Enhancement using Adaptive Data-Based Dictionary Learning
In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
Utilizing Kernel Adaptive Filters for Speech Enhancement within the ALE Framework
Performance of the linear models, widely used within the framework of adaptive line enhancement (ALE), deteriorates dramatically in the presence of non-Gaussian noises. On the other hand, adaptive implementation of nonlinear models, e.g. the Volterra filters, suffers from the severe problems of large number of parameters and slow convergence. Nonetheless, kernel methods are emerging solutions t...
Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty
In this paper, an estimator for speech enhancement based on a Laplacian Mixture Model is proposed. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, where the clean speech DFT coefficients are assumed to follow a mixture of Laplacians and the DFT coefficients of noise are assumed to follow a zero-mean Gaussian distribution. Furthermore, the MMS...
Journal title:
- Speech Communication
Volume 34, Issue
Pages -
Publication date 2001